Weak purifying selection, acting on many linked mutations, may play a major role in
shaping patterns of molecular evolution in natural populations. Yet efforts to infer these
effects from DNA sequence data are limited by our incomplete understanding of weak
selection on local genomic scales. Here, we demonstrate a natural symmetry between
weak and strong selection, in which the effects of many weakly selected mutations on
patterns of molecular evolution are equivalent to a smaller number of more strongly
selected mutations. By introducing a coarse-grained "effective selection coefficient," we
derive an explicit mapping between weakly selected populations and their strongly
selected counterparts, which allows us to make accurate and efficient predictions across
the full range of selection strengths. This suggests that an effective selection coefficient
and effective mutation rate, rather than an effective population size, are the most accurate
summary of the effects of selection over locally linked regions. Moreover, this
correspondence places fundamental limits on our ability to resolve the effects of weak
selection from contemporary sequence data alone.



Purifying selection maintains important biological function by purging deleterious
mutations and is thought to play a major role in shaping the patterns of molecular
evolution in many organisms (Charlesworth 2012). In principle, these patterns can provide
important information about the selective forces operating within a population, and could
be used to disentangle this signal from other factors, such as demographic history
(Williamson et al. 2005). Yet existing methods are limited by our incomplete understanding
of purifying selection on local genomic scales, where many linked sites are potentially
selected against.

The action of selection on neighboring sites creates correlations within a genotype that
can be difficult to disentangle from each other. Early treatments assumed that these
correlations were essentially equivalent to an increase in genetic drift, or a reduction in
effective population size, and that the individual sites otherwise evolve independently
(Charlesworth 2009; Hill and Robertson 1966). Recent studies of these "Hill-Robertson
interference" effects have challenged the validity of this assumption (Bustamante et al.
2001; Comeron and Kreitman 2002; Comeron et al. 2008; Santiago and Caballero 1998),
particularly for the case of weak selection. But without a simple alternative, the effective
population size picture continues to dominate much of our qualitative understanding of
linked selection and its application to data from natural populations.

Meanwhile, attempts to incorporate linkage more explicitly have been limited to the case
where the strength of purifying selection is strong and the number of deleterious
polymorphisms is small. In this regime, correlations within genotypes are still highly
uncertain, but the distribution of fitnesses within the population can be modeled very
precisely. For extremely strong selection, this leads to the classic background selection
picture, in which the apparent size of the population is reduced to the size of the
least-loaded class (Charlesworth et al. 1993). More generally, methods based on the
structured coalescent framework (Kaplan et al. 1988) lead to improved (though more
complicated) analytical predictions (Nicolaisen and Desai 2012; Walczak et al. 2012), as
well as a class of extremely efficient backward-time simulations that can be used to
rapidly calculate any quantity of interest (Gordo et al. 2002; Hudson and Kaplan 1994).



Yet there is increasing evidence that at these local genomic scales, selection is dominated
not by a few strongly deleterious polymorphisms, but rather by many more weakly selected
mutations that can segregate in the same lineage (Barraclough et al. 2007; Bartolome and
Charlesworth 2006; Comeron and Kreitman 2002; [...] 2011; Seger et al. 2010; Subramanian
2012). In this regime, the strong-selection results break down due to the increased
importance of stochastic fluctuations, which can carry some deleterious alleles to
intermediate or high frequencies while driving others to extinction. In the extreme case,
these fluctuations can sometimes lead to the extinction of the wild-type class and the
subsequent fixation of a deleterious allele, an effect known as Muller's ratchet (Muller
1964). The complexity of these forces has led to the belief that the dynamics of weak
selection are of a fundamentally different character than strong selection, and that a new
theoretical picture is required to understand them. Various numerical methods have been
devised for this regime, but they either become computationally prohibitive for more than
a few selected sites (Barton and Etheridge 2004; Barton et al. 2004; Krone and Neuhauser
1997; Neuhauser and Krone 1997) or require computation time that scales with the size of
the population (O'Fallon et al. 2010; Seger et al. 2010), similar to traditional
forward-time simulations.

FIG. 1 The deterministic prediction for the distribution of fitnesses in the population
at mutation-selection balance when $\lambda = U_d/s \approx 3$, and a possible ancestral
history for a sample of two individuals.

The apparent intractability of the weak selection 
regime is somewhat paradoxical, given that the limit of 
infinitely weak selection is simply a neutral population. 
Part of this difficulty arises from the fact that existing 
coalescent models of purifying selection explicitly track 
the number of deleterious mutations in each individual. 
An entirely neutral theory that required similar account- 
ing for neutral mutations would be intractable for many 
of the same reasons. However, this superficial difficulty 
obscures a more fundamental aspect of weakly selected 
mutations: individually, they have a negligible impact 
on the ancestral process and are indistinguishable from 
their neutral counterparts, but the accumulation of many 
such mutations can have a significant effect on the overall 
diversity of the sample. 

In the present work, we exploit this separation of scales 
to establish a correspondence between the strong and 
weak selection regimes. By relaxing our definition of neu- 
tral and selected mutations and introducing a rescaled ef- 
fective selection strength, we demonstrate an equivalence 
principle relating the patterns of diversity among popu- 
lations with differing strengths of selection. For a given 
population in the weak selection regime, this defines a 
mapping to a corresponding strong-selection model that 
captures most of the quantitative features of the original 
population. The previously developed strong selection 
results can therefore be extended to provide a single, uni- 
fied theory valid over the entire range of selective effects, 
which provides valuable qualitative insights into the net 
effect of purifying selection. 

This correspondence has obvious practical benefits for the analysis of DNA sequence data,
since the existing strong-selection techniques can generate efficient predictions across a
wide range of parameters, and can potentially form the basis for self-consistent inference
of the underlying selective forces and population sizes. These results have important
qualitative implications as well, providing a simple and intuitive alternative to the
popular (yet flawed) effective population size picture. Rather, our correspondence suggests
that a more natural local quantity is an effective strength of selection, defined over some
characteristic linkage block. However, the equivalence between strong and weak selection,
and the equivalence between weakly selected populations themselves, suggests an inherent
limit to our ability to resolve selection pressures from contemporary polymorphism data,
especially at the level of individual sites.



I. ANALYSIS 

In order to quantify the molecular diversity generated by purifying selection at many
linked sites, we confine our attention to a simple and well-studied model in which these
effects are known to play a major role. We consider a population of $N$ non-recombining
haploid individuals that accumulate neutral mutations at rate $U_n$ and suffer deleterious
mutations with a constant multiplicative fitness effect $s$ at rate $U_d$. We assume that
the sequences in the population are well described by an infinite sites model in which each
mutation occurs at a unique site in the genome, and we neglect compensatory or otherwise
beneficial mutations. In addition, we work in the standard diffusion limit $N \to \infty$,
where the scaled parameters $Ns$, $NU_d$, and $NU_n$ are sufficient to determine all
quantities of interest.
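
As an illustration of the model just described, the following is a minimal forward-time sketch in Python. It is not the simulation code used for the results in this paper: it tracks only the number of deleterious mutations per individual (neutral mutations and genealogies are ignored), and the function names, the life-cycle order (selection, then mutation), and the parameter values are assumptions chosen for illustration.

```python
import math
import random


def poisson(rng, mean):
    """Draw a Poisson variate with Knuth's multiplication method (fine for small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_load(N, s, U_d, generations, seed=0):
    """Return the number of deleterious mutations carried by each of N haploids.

    Each generation, parents are sampled with probability proportional to
    (1 - s)**k (multiplicative fitness), and every offspring then gains a
    Poisson(U_d) number of new deleterious mutations (one-way mutation).
    """
    rng = random.Random(seed)
    load = [0] * N  # hypothetical initial condition: a mutation-free population
    for _ in range(generations):
        weights = [(1.0 - s) ** k for k in load]
        parents = rng.choices(load, weights=weights, k=N)
        load = [k + poisson(rng, U_d) for k in parents]
    return load


if __name__ == "__main__":
    load = simulate_load(N=3000, s=0.1, U_d=0.2, generations=150)
    print("mean number of deleterious mutations:", sum(load) / len(load))
    print("deterministic expectation U_d / s:", 0.2 / 0.1)
```

With strong selection and a small mutation rate, as in the usage example, the mean load should settle near $U_d/s$; for weaker selection the class sizes fluctuate and this simple expectation becomes less reliable.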



A. Strong selection 

The behavior of this model has been well-characterized when selection against the
deleterious alleles is sufficiently strong (we discuss the exact conditions below). In this
case, the population reaches a steady state in which the continuous influx of deleterious
mutations is balanced on average by the action of selection against them. In the limit
$Ns \to \infty$ where genetic drift can be neglected, the expected fraction of individuals
with $k$ deleterious mutations ("fitness class $k$") is given by

\begin{equation}
h_k = \frac{\lambda^k}{k!} e^{-\lambda} \tag{1}
\end{equation}

where $\lambda = U_d/s$ parameterizes the relative strength of mutation and selection
(Haigh 1978). An example of this distribution is shown in Fig. 1. As long as Eq. (1)
provides a good approximation to the actual stochastic class sizes, the corresponding
patterns of diversity are equivalent to a demographically structured neutral population,
where the $h_k$ are treated as fixed subpopulations and deleterious mutations are recast as
migration between the $h_k$ (see Appendix A). This is a special case of the structured
coalescent introduced by Kaplan et al. (1988), which traces the ancestry of a sample as it
moves through the population fitness distribution (see Fig. 1).
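
The deterministic class sizes are a Poisson distribution with mean $\lambda = U_d/s$ (Haigh 1978), which is easy to evaluate numerically. The short Python sketch below does so; the function name `class_fractions` and the example parameters (chosen so that $\lambda = 3$, as in Fig. 1) are illustrative assumptions.

```python
import math


def class_fractions(U_d, s, k_max):
    """Deterministic fractions h_k = lambda**k / k! * exp(-lambda) for k = 0..k_max."""
    lam = U_d / s
    return [lam ** k / math.factorial(k) * math.exp(-lam) for k in range(k_max + 1)]


if __name__ == "__main__":
    h = class_fractions(U_d=0.3, s=0.1, k_max=30)  # lambda = 3
    print("least-loaded class fraction h_0:", h[0])
    print("mean number of mutations:", sum(k * hk for k, hk in enumerate(h)))
```

The least-loaded fraction $h_0 = e^{-\lambda}$ is the quantity that sets the effective population size in the background selection limit.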

This simplified structured coalescent admits approximate analytical calculations for
several simple diversity statistics. These reduce to the standard background selection^1
limit when $Ns \to \infty$ (Charlesworth et al. 1993), in which the linked neutral
diversity is equivalent to an unstructured neutral population with effective population
size

\begin{equation}
N_e \approx N e^{-\lambda} . \tag{2}
\end{equation}

Non-neutral corrections arise for $Ns < \infty$ and require more complicated calculations
(Gordo et al. 2002; Nicolaisen and Desai 2012; Walczak et al. 2012). More importantly in
practice, the simplified structured coalescent can be used to efficiently simulate
genealogies in a time that scales with the size of the sample rather than the size of the
population (Gordo et al. 2002; Hudson and Kaplan 1994). These coalescent simulations can
rapidly generate predictions for many statistics of interest, and could potentially be
leveraged to enable full-scale inference of population parameters.

^1 There is some ambiguity in the literature regarding the term "background selection,"
specifically whether it refers to the general effects of purifying selection at linked
neutral loci or to the limiting behavior that arises when selection is extremely strong.
Here, we use the term in the latter sense as defined by Eq. (2).

However, this simplified picture is crucially dependent on the assumption that the fitness
class sizes (and hence the mutation and coalescence rates) are effectively fixed by the
deterministic mutation-selection balance in Eq. (1). In general, genetic drift will cause
the actual class sizes to fluctuate around these deterministic predictions, so the validity
of our assumption will depend on the severity of these fluctuations. Given our assumption
of one-way mutation, there is also a nonzero probability that the least-loaded ($k = 0$)
class fluctuates to extinction, allowing one of the deleterious alleles to fix. This effect
is known as Muller's ratchet, and it implies that the deterministic mutation-selection
balance is stochastically unstable. In the absence of compensatory forces, the entire
population will tend to drift toward lower fitness at some small but nonzero rate.

While it is not surprising that these stochastic forces eventually cause significant
deviations from Eq. (1), a quantitative characterization of this breakdown is complicated,
and the precise answer depends on the quantity of interest. For example, one could examine
fluctuations in the fitness class sizes (Neher and Shraiman 2012), the transition between
the so-called "slow" and "fast" regimes of Muller's ratchet (Gessler 1995), the breakdown
of the background-selection limit (Gordo et al. 2002), or empirical estimates of divergence
from the structured coalescent (see below). Fortunately, most of these definitions of
"strong selection" lead to conditions of the form $Ns e^{-\lambda} \gg g(\lambda)$, where
$g(\lambda)$ is some slowly growing function of $\lambda$. In practice, the simplified
condition

\begin{equation}
Ns e^{-\lambda} \gg 1 \tag{3}
\end{equation}

is generally sufficient to ensure the validity of the structured coalescent for most
quantities of interest. We note, however, that even for selection pressures that are
traditionally considered to be strong ($Ns \sim 10$), this condition will be violated if
the mutation rates are high enough that many of these mutations segregate in the population
at the same time ($\lambda \gg 1$). Thus, the strong-selection regime could be more
accurately described as a strong-selection/weak-mutation regime.
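
To make the last point concrete, the Python sketch below evaluates the background selection effective size $N e^{-\lambda}$ and the quantity $Ns e^{-\lambda}$ that the strong-selection condition compares against 1. The function names and the numerical values are illustrative assumptions, not parameters taken from this study.

```python
import math


def background_selection_Ne(N, U_d, s):
    """Effective size in the background selection limit, N * exp(-U_d / s)."""
    return N * math.exp(-U_d / s)


def strong_selection_parameter(N, U_d, s):
    """Ns * exp(-lambda); the structured coalescent is expected to hold when this is large."""
    return N * s * math.exp(-U_d / s)


if __name__ == "__main__":
    N, s = 10_000, 1e-3  # Ns = 10, traditionally regarded as strong selection
    for U_d in (1e-4, 3e-3):  # lambda = 0.1 and lambda = 3
        print("lambda =", U_d / s,
              "N_e =", background_selection_Ne(N, U_d, s),
              "Ns*exp(-lambda) =", strong_selection_parameter(N, U_d, s))
```

For the same $Ns = 10$, the parameter falls from roughly 9 at $\lambda = 0.1$ to below 1 at $\lambda = 3$, so the second population would not be in the strong-selection regime despite its nominally strong selection.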



B. Equivalence Principle for Weak Selection 

As selection grows weaker, the deterministic mutation-selection balance provides an
increasingly poor estimate of the distribution of fitnesses within the population, as
stochastic fluctuations and Muller's ratchet take on a larger role. It is instructive to
consider the extreme limit where $Ns = 0$, when these stochastic forces are strongest.
Although we do not normally visualize a neutral population in this way, we could also
partition it into "fitness classes" according to the number of mutations in each
individual. In this case, the resulting fitness classes fluctuate wildly on coalescent
timescales, and Muller's ratchet "clicks" at rate $U_d$.

But if $Ns$ is identically zero, this population should also be described by the standard
neutral coalescent, which ignores all of these complicated factors. Instead of explicitly
tracking the number of mutations in each individual, the neutral coalescent places the
entire population within a single fitness class of size $N$, where fluctuations can be
neglected. This simplification arises because in a neutral population, it does not matter
which mutations are accounted for by the fitness classes and which accumulate within a
fitness class, so long as the total mutation rate is preserved. A population with "weakly
selected" deleterious mutations of effect $s = 0$ is equivalent to one with "strong
selection" ($s' > 0$) if we simultaneously take $U_d' \to 0$ and $U_n' \to U_n + U_d$.

This correspondence between weak and strong selection in the neutral limit is admittedly
rather trivial, but it suggests that a similar reorganization may hold more generally for
weak but non-vanishing $Ns$. Intuitively, we seek a coarse-grained version of the fitness
distribution at some larger scale $s'$, such that fitness differences less than $s'$ are
ignored, but clusters of mutations with cumulative effect $s'$ are treated as a single,
large-effect mutation (see Fig. 2). In addition, we wish to choose $s'$ and the
reorganized mutation rates $U_d'$ and $U_n'$ in order to mimic (as closely as possible) the
patterns of diversity in the original population. It is not clear that this equivalent
population should exist a priori, and even if it does, the new parameters $Ns'$, $NU_d'$,
and $NU_n'$ could potentially depend on the underlying parameters $Ns$, $NU_d$, and $NU_n$
in some complicated way. Nevertheless, we demonstrate below that an explicit equivalence
principle can be obtained from a few simple considerations.

FIG. 2 An intuitive picture of the coarse-graining proposed for weakly selected
populations, where the population fitness distribution (only a portion of which is shown
here) consists of a large number of fitness classes whose sizes fluctuate considerably.
Clusters of several fitness classes are grouped together into larger, effective fitness
classes separated by fitness $s'$. Deleterious mutations within an effective fitness class
are recast as neutral mutations, and only mutations between effective classes are treated
as deleterious.

In our neutral example above, we saw that populations with $Ns = 0$ and various
combinations of $U_d$ and $U_n$ are equivalent as long as the total mutation rate
$NU_{\mathrm{tot}} = NU_d + NU_n$ is preserved. For populations with $Ns > 0$, it is
reasonable to expect an additional constraint on the overall scale of selection, which was
automatically preserved in the neutral case. This scale is not determined by individual
selected mutations, but rather by the emergent distribution of fitnesses within the
population. Of course, the fitness distribution is difficult to characterize in weakly
selected populations precisely because of the complicated stochastic effects discussed
above. And even if a full solution were available, it is unlikely that the fitness
distributions for two different $(Ns, NU_d)$ combinations would exactly coincide.
Fortunately, previous studies suggest that the dominant feature of the fitness distribution
in the $Ns \to 0$ limit is the variance (Good and Desai 2012; O'Fallon et al. 2010), which
gives a measure of the typical reproductive difference between a random pair of
individuals. When the full effects of drift are included, the variance in fitness within
the population is given by

\begin{equation}
\sigma^2 = U_d s \left( 1 - \frac{R}{U_d} \right) \tag{4}
\end{equation}

where $R$ is the deleterious substitution rate (Haigh 1978). We calculate this rate in
Appendix B, which completely determines $\sigma^2(Ns, NU_d)$ as a function of the
underlying parameters $Ns$ and $NU_d$.

Thus, we propose an equivalence principle between weakly selected populations, in which the
patterns of diversity are equal when

\begin{equation}
\begin{aligned}
NU_n + NU_d &= NU_n' + NU_d' , \\
\sigma^2(Ns, NU_d) &= \sigma^2(Ns', NU_d') .
\end{aligned} \tag{5}
\end{equation}

In the simple case where the deterministic approximation $\sigma^2 \approx U_d s$ is valid,
we have $U_d' = U_d (s/s')$, which has a natural interpretation in terms of the
coarse-grained fitness distribution depicted in Fig. 2. It is important to note that the
equivalence defined by Eq. (5) is only approximately correct, and it is only valid up to
some maximum strength of selection where moments of the fitness distribution other than
$\sigma^2$ start to become important. This breakdown is unsurprising: when
$Ns e^{-\lambda} \gg 1$ we must recover the strong selection limit, where it is known that
the least-loaded class plays a much larger role than the bulk of the fitness distribution
(Charlesworth et al. 1993; Neher and Shraiman 2012; Nicolaisen and Desai 2012).

When our equivalence principle holds, Eq. (5) defines an equivalence class of populations
indexed by a particular value of $NU_{\mathrm{tot}}$ and $N\sigma$, in which different
underlying parameters $Ns$, $NU_d$, and $NU_n$ nevertheless generate the same patterns of
diversity. Yet the vast majority of these populations lie well beyond the range of validity
of existing methods like the structured coalescent. In order to actually predict the
patterns of diversity in these populations, we therefore look for a representative
population $(Ns, NU_d, NU_n)$ within each equivalence class where selection is
simultaneously weak enough for our equivalence principle to hold and yet strong enough for
these previously developed techniques to be valid.

From our numerical and analytical studies of the structured coalescent (see Appendix C), we
have seen that the predictions for the mean pairwise coalescence time
$\langle T_2 \rangle / N$ reach a minimum for a particular value of $Ns$, below which the
predictions rapidly diverge from the results of forward-time simulations (see Fig. 3B). It
therefore seems reasonable to take $Ns$ to be this minimum point, which satisfies

\begin{equation}
\left. \frac{\partial \langle T_2 \rangle}{\partial s} \right|_{\sigma^2} = 0 . \tag{6}
\end{equation}

The subscript denotes that the fitness variance $\sigma^2$ is to be held constant when
taking the derivative. In Appendix C we show how this derivative can be calculated using
the methods outlined in Walczak et al. (2012). The resulting locus of points yields a
critical line in the $(Ns, NU_d)$ plane parameterized by the fitness variance
$(N\sigma)^2$, as depicted in Fig. 3A. Each point along this critical line corresponds to a
coarse-grained model with parameters $(Ns, NU_d)$ where the structured coalescent is valid,
and which can be used to predict the patterns of diversity for all weakly selected
populations within that equivalence class. For any particular set of parameters, the
corresponding coarse-grained model can be easily calculated from Eqs. (5) and (6), with the
help of our expression for $\sigma^2$ in Eq. (4).

FIG. 3 (A) Predictions for the mean pairwise coalescence time $\langle T_2 \rangle / N$
obtained from structured coalescent simulations of our coarse-grained model. The solid
black line denotes the "critical line" $(Ns, NU_d)$ defined by Eq. (6), while the dashed
lines to the left and right denote lines of constant $N\sigma$ and lines of constant
$\lambda$, respectively. (B) A "slice" of this phase plot at constant $NU_d \approx 50$.
The black squares denote the results of forward-time, Wright-Fisher simulations and our
coarse-grained predictions are shown in solid red. For comparison, the solid blue line
shows the original structured coalescent predictions, while the dashed blue line shows the
background selection approximation $\langle T_2 \rangle \approx N e^{-\lambda}$.
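
The bookkeeping behind the equivalence can be sketched in a few lines of Python. The sketch assumes the deterministic approximation $\sigma^2 \approx U_d s$ (the substitution rate $R$ is supplied by hand rather than computed as in Appendix B), and it takes the coarse-grained selection strength `s_new` as an input instead of locating it on the critical line; the function names are hypothetical.

```python
def fitness_variance(U_d, s, R=0.0):
    """sigma^2 = U_d * s * (1 - R / U_d); R is the deleterious substitution rate."""
    return U_d * s * (1.0 - R / U_d)


def rescale_parameters(U_n, U_d, s, s_new):
    """Return (U_n', U_d') for a chosen s', keeping U_tot and U_d * s fixed.

    Assumes sigma^2 ~ U_d * s, i.e. the ratchet correction R is neglected.
    """
    U_d_new = U_d * s / s_new
    U_n_new = U_n + U_d - U_d_new
    if U_n_new < 0:
        raise ValueError("s_new is too small: the neutral rate would be negative")
    return U_n_new, U_d_new


if __name__ == "__main__":
    U_n, U_d, s = 0.01, 0.05, 0.001
    U_n_new, U_d_new = rescale_parameters(U_n, U_d, s, s_new=0.005)
    print("coarse-grained rates:", U_n_new, U_d_new)
    print("variance before and after:",
          fitness_variance(U_d, s), fitness_variance(U_d_new, 0.005))
```

Increasing the selection strength fivefold reduces the deleterious mutation rate fivefold, with the remainder reassigned to the neutral class so that the total mutation rate is unchanged.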



II. RESULTS 

In the previous section, we argued that the patterns of molecular diversity should be
equivalent for weakly selected populations with the same total mutation rate and variance
in fitness. Within each of these equivalence classes, we have also identified a particular
"coarse-grained" population $(Ns, NU_d, NU_n)$ where previous strong selection methods
based on the structured coalescent can be applied. This mapping yields explicit predictions
for various diversity statistics across the full range of selection strengths, an example
of which is shown in Fig. 3A for the mean pairwise coalescent time
$\langle T_2 \rangle / N$. Parameters that fall to the right of the critical line (depicted
by the solid line in Fig. 3A) lie in the strong selection regime where
$\langle T_2 \rangle / N$ is directly calculated from the structured coalescent. The vast
majority of these points are well-characterized by the background selection limit in
Eq. (2), which implies that the level sets of $\langle T_2 \rangle / N$ lie along lines of
constant $\lambda$ (the dashed-dotted line in Fig. 3A). As observed in previous studies,
this strong selection equivalence starts to break down near the critical line where the
full structured coalescent is required to obtain accurate predictions (Gordo et al. 2002;
Walczak et al. 2012). Those parameter values that lie to the left of the critical line are
the domain of our coarse-grained theory and the corresponding equivalence along lines of
constant $N\sigma$ (depicted by the dashed line in Fig. 3A). We obtain predictions for
these populations by applying the structured coalescent to the coarse-grained parameters
$(Ns, NU_d, NU_n)$ calculated using the procedure in Appendix C. Intuitively, this amounts
to tracing the line of constant $N\sigma$ back to the corresponding point on the critical
line in Fig. 3A.

The accuracy of these predictions rests on two crucial assumptions, which we verify using
forward-time, Wright-Fisher simulations (described in Appendix D) for several important and
experimentally relevant diversity statistics. First, populations with the same fitness
variance $(N\sigma)^2$ should yield similar results for various diversity statistics.
Second, structured coalescent predictions should agree with the results of forward-time
simulations along the critical line $(Ns, NU_d)$.

As we demonstrate in Figs. 4 and 5, both of these assumptions are approximately valid
across a large range of parameter values. In Fig. 4A, we plot the mean pairwise coalescent
time $\langle T_2 \rangle / N$ obtained from forward-time simulations for a large
collection of parameters spanning several orders of magnitude in $Ns$ and $NU_d$, all of
which lie to the left of the critical line in Fig. 3A. The results are organized by their
observed fitness variance along the x-axis and colored according to the selection strength
of the simulated population. Differently colored points at the same value of $N\sigma$
represent populations with different underlying parameters that fall within the same
predicted equivalence class. If our equivalence principle is correct, these colored points
should all lie on the same line. In addition, we also plot the structured coalescent
predictions for the coarse-grained parameters $(Ns, NU_d)$ as a function of $N\sigma$,
which show good agreement with both forward-time simulations of the critical line (black
triangles) and the other populations in each equivalence class.

One of the reasons that the pairwise coalescent time $\langle T_2 \rangle$ plays such a
prominent role in earlier studies is that in the standard Hill-Robertson picture, it is
equivalent to the effective population size that supposedly captures all of the effects of
linked selection. Even when $\langle T_2 \rangle$ cannot be predicted analytically, it can
be measured from the average heterozygosity at putatively neutral (e.g. synonymous) sites.
Both the selected and non-selected sites are then assumed to evolve independently with an
effective population size $N_e = \langle T_2 \rangle$. While this intuition works well for
predicting the average nonsynonymous heterozygosity (Kaiser and Charlesworth 2008; see
Appendix C), several studies have shown that it fails for other statistics that are more
sensitive to the correlations produced by linked selection (Comeron and Kreitman 2002;
Comeron et al. 2008; Santiago and Caballero 1998). In Fig. 4B, we plot the total number of
segregating sites $S_n$ in a sample of $n = 100$ individuals in a similar manner as
Fig. 4A. We see that even after conditioning on the "correct" reduction in
$N_e = \langle T_2 \rangle$, this independent sites assumption significantly underestimates
the total diversity in the sample when $N\sigma > 1$. By contrast, the structured
coalescent predictions of the coarse-grained model yield accurate results for the full
range of parameters, without needing to fit to the correct $\langle T_2 \rangle$.

FIG. 4 (A) The mean pairwise coalescence time $\langle T_2 \rangle / N$ in the
weak-selection regime (top), collated from forward-time, Wright-Fisher simulations at five
"slices" of constant $NU_d = 1, 10, 50, 100$ and $300$ (similar to Fig. 3B), and the three
"slices" of constant $N\sigma = 2, 7, 43$ shown in Fig. 5. Each simulated population is
plotted according to its fitness variance $N\sigma$ (averaged over the simulation run) and
colored according to the underlying selection strength $Ns$. In addition, direct
forward-time simulations of the critical line $(Ns, NU_d)$ are shown as black triangles,
while the predictions from the structured coalescent are shown in solid red. (B) A similar
plot for the total number of segregating sites $S_n$ in a sample of $n = 100$ individuals,
where the total mutation rate is given by $NU_{\mathrm{tot}} \approx 350$. For comparison,
the blue line shows the predictions obtained by assuming independent evolution at different
sites, with an effective population size $N_e$ fitted from $\langle T_2 \rangle$ above.
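
For orientation, the neutral-site part of an independent-sites baseline can be written down directly from the standard neutral coalescent. For a haploid population whose pairwise coalescence time is $\langle T_2 \rangle$ generations, the expected number of segregating sites is $2 \langle T_2 \rangle U a_n$ with $a_n = \sum_{i=1}^{n-1} 1/i$. The Python sketch below evaluates this expression; it is an assumption-laden illustration that treats all mutations as neutral, and it does not reproduce the selected-site contribution that enters the comparison in Fig. 4B.

```python
def harmonic_sum(n):
    """a_n = sum_{i=1}^{n-1} 1/i, the factor in the expected total tree length."""
    return sum(1.0 / i for i in range(1, n))


def expected_segregating_sites(mean_T2, U, n):
    """E[S_n] = 2 * <T_2> * U * a_n for neutral sites evolving independently.

    mean_T2 is the mean pairwise coalescence time in generations and U is the
    per-generation mutation rate of the sites being counted.
    """
    return 2.0 * mean_T2 * U * harmonic_sum(n)


if __name__ == "__main__":
    # Hypothetical values: <T_2> = 500 generations, U = 0.01, sample of n = 100.
    print(expected_segregating_sites(mean_T2=500.0, U=0.01, n=100))
```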

In addition to reducing the overall levels of diversity described by statistics like
$\langle T_2 \rangle$ and $S_n$, it is also well-known that purifying selection alters the
relative branch lengths in the genealogy of a sample. This distortion is typically measured
using the polymorphic site frequency spectrum or one of its derivatives such as Tajima's
$D$ (Tajima 1989) or Fu and Li's $D$ (Fu and Li 1993). When normalized by the total number
of singletons, the neutral expectation for the frequencies $f_i$ of the sites polymorphic
in $i$ individuals is given by the parameter-free estimate

\begin{equation}
\left. \frac{f_i}{f_1} \right|_{\mathrm{neutral}} = \frac{1}{i} . \tag{7}
\end{equation}

Purifying selection leads to an increase in rare variants ($i \ll n$) compared to this
neutral expectation, since deleterious mutations are typically purged before they can drift
to appreciable frequencies. Unlike $\langle T_2 \rangle$ or $S_n$, the site frequency
spectrum can be used to detect deviations from neutrality without requiring previous
estimates of population size or mutation rate, and is therefore highly useful in the
analysis of real sequence data. In Fig. 5 we plot the site frequency spectrum for a sample
of $n = 100$ individuals for a range of populations along three particular lines of
constant $N\sigma$ and $NU_{\mathrm{tot}}$, but to the left of the critical line in
Fig. 3A. Again, we see that populations in the same equivalence class possess very similar
frequency spectra, which turn increasingly non-neutral with larger $N\sigma$. We also show
the structured coalescent predictions of the coarse-grained model for each value of
$N\sigma$, which agree quite well with these forward-time simulations.
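
The comparison against the neutral expectation can be sketched as follows in Python. The neutral ratio $f_i/f_1 = 1/i$ is the standard coalescent result; the function names and the toy input (a list giving, for each polymorphic site, the number of sampled individuals that carry the mutation) are assumptions made for illustration.

```python
from collections import Counter


def neutral_sfs_ratio(n):
    """Neutral expectation f_i / f_1 = 1 / i for i = 1..n-1."""
    return [1.0 / i for i in range(1, n)]


def normalized_sfs(derived_counts, n):
    """Observed f_i / f_1 from per-site counts of mutation carriers in a sample of n."""
    counts = Counter(c for c in derived_counts if 0 < c < n)  # polymorphic sites only
    singletons = counts.get(1, 0)
    if singletons == 0:
        raise ValueError("no singletons in the sample")
    return [counts.get(i, 0) / singletons for i in range(1, n)]


if __name__ == "__main__":
    # Toy data for a sample of n = 5: four singletons, two doubletons, one tripleton.
    observed = normalized_sfs([1, 1, 1, 1, 2, 2, 3, 5], n=5)
    print("observed:", observed)
    print("neutral: ", neutral_sfs_ratio(5))
```

An excess of rare variants shows up as observed ratios that fall below the neutral curve at larger $i$.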

However, despite the generally good agreement with forward-time simulations, some small
systematic errors remain. For example, the postulated equivalence between populations with
the same fitness variance is only approximately true. As can be seen in Figs. 4 and 5,
populations that are further from the critical line are generally slightly "less neutral"
than their counterparts closer to the critical line, although these differences are often
dwarfed by the variation along the critical line itself. In addition, the accuracy of the
structured coalescent along the critical line is diminished within a region with low $Ns$
and $NU_d$, which leads us to slightly overestimate overall levels of diversity in this
region. These issues are discussed in more detail in Appendix C.



III. DISCUSSION 

We have demonstrated an approximate symmetry between the patterns of molecular evolution
generated by weak and strong purifying selection. Weakly selected mutations have a
negligible impact individually, but a sufficiently large number of these mutations will
combine to mimic the effects of a single, stronger mutation whose scale is set by the
typical fitness differences within the population. This correspondence allows us to import
a large body of theory originally developed for the strong selection regime, which we can
use to obtain highly accurate predictions across a much broader range of selection
strengths than was previously possible.

FIG. 5 The polymorphic site frequency spectrum for a sample of $n = 100$ individuals,
collated from populations with total mutation rate $NU_{\mathrm{tot}} \approx 350$ and
fitness variance $N\sigma \approx 2$ (blue), $N\sigma \approx 7$ (green), and
$N\sigma \approx 43$ (red). Symbols denote the results of forward-time simulations (which
are shaded from light to dark with decreasing $Ns$), while the corresponding structured
coalescent predictions are shown as solid lines. For comparison, the prediction for a
completely neutral population is depicted by the dashed line. The inset shows the same
figure on a log-log scale.

Our results are consistent with observations from earlier
simulation studies of weak selection (Comeron and Kreitman
2002; Gordo et al. 2002; Kaiser and Charlesworth 2008;
McVean and Charlesworth 2000; Seger et al. 2010), but our
coarse-grained model offers a radically different
perspective on the relevant processes that contribute to
molecular evolution in this regime. Previous work has
argued that virtually all of the deviations from the
traditional background selection limit can be attributed
to the effects of Muller's ratchet (Gordo et al. 2002;
Seger et al. 2010), as well as to the influence of weak
Hill-Robertson interference, where large fluctuations
drive weakly selected alleles to intermediate frequencies
(Comeron and Kreitman 2002; McVean and Charlesworth 2000;
Seger et al. 2010). In contrast, our coarse-grained theory
includes neither of these complications (aside from the
corrections to σ²) and still captures the quantitative
patterns of variation over broad scales. This suggests
that variance in ancestral fitness, not fluctuations or
the ratchet, is the driving force behind the large-scale
patterns of diversity. More complicated stochastic effects
may be essential for a first-principles account of linked
selection, or for more exotic parameter ranges, but they
appear to be of secondary importance for the quantities
and regimes considered here.

Our finding that strong selection methods can be used 
across the full range of parameters has important prac- 
tical implications for the inference of selection pressures 
and population sizes from DNA sequence data. Several 
previous studies have uncovered evidence for the role of
weak purifying selection in natural populations
(Barraclough et al. 2007; Betancourt et al. 2009; Loewe
and Charlesworth 2007; Seger et al. 2010), but this
evidence is limited by the difficulty in obtaining proper
estimates of the population size and selection strength
without accounting for the influence of selection at
neighboring sites. Previous analyses either ignore linkage
altogether (Boyko et al. 2008; Hartl et al. 1994;
Keightley and Eyre-Walker 2007; Tamuri et al. 2012;
Williamson et al. 2005) or depend on computationally
costly forward-time simulations evaluated over a narrow
range of parameters (Kaiser and Charlesworth 2008;
Lohmueller et al. 2011; Seger et al. 2010). Estimates
obtained from the first
method should be treated cautiously, since the results 
here and elsewhere demonstrate that the independent- 
sites assumption is drastically violated when selection at 
linked sites is common. Although these effects can be 
treated more rigorously in simulations, our present analy- 
sis has uncovered an approximate equivalence or degener- 
acy between populations with weak and strong purifying 
selection. Equivalent populations are distributed along 
low-dimensional "ridges" embedded in the larger space of 
parameters, which can be easily missed when simulating 
a discrete set of parameter values. The degeneracy iden- 
tified here arises in a simple model with only three pa- 
rameters, and it is likely that the number of degeneracies 
will only increase in more complicated models which in- 
volve many more parameters. Thus, our analysis argues 
for a degree of caution when interpreting the estimates 
from simulation studies as well, since it can be difficult 
to determine whether there are other "equivalent" sets 
of parameters which are equally (or only slightly less) 
consistent with the data. Ideally, one could combine the 
knowledge of the degeneracies identified here with the 
more computationally efficient coalescent simulations to 
devise a self-consistent inference scheme that utilizes fre- 
quency spectrum data, or potentially even phylogenetic 
reconstruction (Rambaut et al. 2008). A concrete
implementation is beyond the scope of the present paper,
but this remains an important avenue for future work. 

The correspondence between weak and strong selection 
also suggests an important qualitative shift in the inter- 
pretation of polymorphism data. It has been known for 
some time that the effects of linked selection are more 
complicated than the traditional picture of independent 
evolution at a reduced effective population size. But 
without a simple alternative, the concept of a local ef- 
fective population size — which influences the efficacy of 
selection and varies along the genome in accordance with 
the neutral heterozygosity — remains a popular means 
of interpreting genomic data (Charlesworth 2009; Gossmann
et al. 2011). In agreement with previous studies, we have
provided further evidence that a simple reduction in
effective population size does not lead to consistent
results for any statistic other than the average
heterozygosity π, even after fitting Ne to reproduce the
observed reduction in neutral diversity. This leads us to
question the ultimate utility of the local effective
population size, given that it requires a fit in order to
describe only one other property of the data. Rather, our
coarse-grained correspondence suggests that a more
suitable local quantity is an effective strength of
selection. In addition to its increased accuracy in
capturing the patterns of diversity as demonstrated above,
an effective strength of
selection is a more natural candidate for a local measure 
of linked selection, since different parts of the genome are 
already under varying degrees of selection. 

However, we must be careful when interpreting this lo- 
cal effective selection strength due to the degeneracy in 
the parameter space discussed above. The accuracy of 
the collapse plots in Figs. 4 and 5 hints at a fundamental
resolution limit for inferring the underlying selection
pressures from polymorphism data alone, since there is 
little statistical power to differentiate a weakly selected 
population from its coarse-grained counterpart (or in- 
deed, any other weakly selected population with the same 
overall mutation rate and fitness variance). This degen- 
eracy is especially problematic for detecting selection at 
individual sites, given that the coarse-grained mapping 
works by reassigning selection from some mutations to 
others with minimal impact on the overall patterns of 
diversity. 

Of course, our analysis is based on a highly simplified
model, and additional work will be required to extend
these results to more biologically realistic scenarios.
Depending on the particular parameter regime involved,
epistasis (Kimura and Maruyama 1966), finite-site effects
(Desai and Plotkin 2008), and the presence of beneficial
or compensatory mutations (Goyal et al. 2012) may all play
a larger role than we have assumed here. Particularly
questionable is our assumption that all deleterious
mutations have the same strength, since it is known that
deleterious mutations have a wide distribution of fitness
effects in many organisms (Eyre-Walker and Keightley
2007). Nevertheless, the issues raised here
are likely to be a factor in any model that includes a 
sufficiently large number of weakly-selected mutations. 
At present, there are no analytical descriptions of most 
of these more complicated scenarios even in the strong- 
selection regime. Thus while a coarse-grained model of 
weak selection could be defined in many of these cases, we 
must first understand the corresponding strong-selection 
model before coarse-graining can provide useful analyti- 
cal predictions. 

Possibly more problematic for immediate data analysis is
our neglect of recombination in the history of the sample,
which limits the direct applications of our theory to
asexual organisms, or to mitochondrial DNA or
nonrecombining regions of the genome in sexual
populations. The qualitative issues here remain important
in the presence of recombination, since the diversity at
each site is influenced by the aggregate selection within
some nonrecombining neighborhood around it. However, a
quantitative extension of these ideas is difficult, since
selection and recombination jointly determine the typical
linkage scale and the resulting selection regime within
that region (Comeron and Kreitman 2002; Kaiser and
Charlesworth 2008; McVean and Charlesworth 2000). We note,
however, that recent work by Zeng and Charlesworth (2011)
has incorporated finite but nonzero recombination rates
into the structured coalescent framework. Thus when 
an analogous structured coalescent can be defined, our 
coarse-graining picture will likely be useful for under- 
standing weak selection in these regimes as well. This 
remains an important avenue for future work. 
Appendix A: The Structured Coalescent

Our theoretical predictions for the diversity statistics
in the main text are calculated within the structured
coalescent framework, which provides an explicit
probabilistic model of the genealogy of a sample from the
population. In its most general form, the structured
coalescent extends the neutral Kingman coalescent to
incorporate arbitrary time-dependent (and possibly
stochastic) demographic structure, but this has proven
difficult to implement in practice. In the present work,
we therefore focus our attention on a particularly simple
special case, where the relevant demographic structure is
the division of the population into constant fitness
classes attained at mutation-selection balance (Hudson and
Kaplan 1994). As such, this simplified structured
coalescent is fundamentally a strong selection result.

In the analysis that follows, we work in the standard
coalescent limit $N \to \infty$, where the scaled
parameters $Ns$, $NU_d$, and $NU_n$ are sufficient to
characterize the population, and we measure time in units
of $N$ generations. As mentioned above, we assume that the
distribution of fitnesses within the population is given
by the deterministic mutation-selection balance in Eq. (1)
in the main text, and we neglect fluctuations in the class
sizes.

We wish to characterize the possible genealogical
histories of a sample of $n$ individuals drawn from the
population. In a random sample, these individuals come
from fitness classes $k_1, \ldots, k_n$ drawn from the
population fitness distribution, which implies that

$$ k_1, \ldots, k_n \overset{\mathrm{iid}}{\sim} \mathrm{Poisson}\left(\lambda = NU_d/Ns\right) . \tag{A1} $$

We then trace the genealogy back to the most recent
common ancestor of the sample. At any given instant,
three types of ancestral events can occur:

1. An individual can experience a neutral mutation at
rate

$$ NU_n . \tag{A2} $$

2. An individual in class $k_i > 0$ can experience a
deleterious mutation (thus transferring it to class
$k_i - 1$) at rate

$$ N s k_i . \tag{A3} $$

3. Two individuals in the same fitness class
$k = k_i = k_j$ can coalesce to a single individual at
rate

$$ \frac{1}{h_k} = \frac{k! \, e^{\lambda}}{\lambda^{k}} , \tag{A4} $$

where $h_k = e^{-\lambda} \lambda^k / k!$ is the fraction
of the population in class $k$.

These events are competing Poisson processes, which im- 
plies that the time to the next event is exponentially 
distributed with mean equal to the reciprocal of the sum 
of the rates of all possible events. The event itself is then 
drawn randomly from the pool of possible events, each 
weighted by its corresponding rate. This process contin- 
ues until the sample has coalesced into a single lineage, 
which is the most recent common ancestor of the sample. 

Thus, for a given set of parameters Ns, NUd, and 
NUn, the distribution of genealogies and mutation events 
for a sample of n individuals is completely specified. The 
distribution of any particular diversity statistic can be 
straightforwardly obtained by averaging over the
distribution of genealogies. In practice, evaluating these
averages analytically can be difficult for all but the
simplest statistics (Walczak et al. 2012). Instead, it is
often easier to use the procedure outlined above to
implement backward-in-time simulations that sample
genealogies from this ancestral process (Gordo et al.
2002). These coalescent simulations are extremely
computationally efficient compared to their ordinary
forward-time
counterparts, since we only need to simulate ancestral 
events for the sample as opposed to simulating the entire 
population at each generation. A copy of our implemen- 
tation in C is available upon request. 
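
The event loop described above can be sketched for the simplest case of a sample of two lineages. This is only an illustrative Python sketch, not the C implementation mentioned in the text: the function names are hypothetical, the coalescence rate within class k is assumed to be the inverse of the Poisson class frequency, and neutral mutations are omitted because they do not change the genealogy.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw a Poisson(lam) variate with Knuth's method (adequate for modest lam)."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def class_frequency(k, lam):
    """Poisson frequency h_k of fitness class k at mutation-selection balance."""
    return math.exp(-lam + k * math.log(lam) - math.lgamma(k + 1))


def pairwise_coalescence_time(Ns, NUd, rng):
    """Simulate T2 (in units of N generations) for two sampled lineages."""
    lam = NUd / Ns
    k1, k2 = sample_poisson(lam, rng), sample_poisson(lam, rng)
    t = 0.0
    while True:
        # Competing events: deleterious mutation on either lineage,
        # or coalescence if both lineages sit in the same class.
        rate_mut1 = Ns * k1
        rate_mut2 = Ns * k2
        rate_coal = 1.0 / class_frequency(k1, lam) if k1 == k2 else 0.0
        total = rate_mut1 + rate_mut2 + rate_coal
        t += rng.expovariate(total)      # waiting time to the next event
        u = rng.random() * total         # pick the event by its rate
        if u < rate_coal:
            return t
        if u < rate_coal + rate_mut1:
            k1 -= 1
        else:
            k2 -= 1


def mean_T2(Ns, NUd, n_samples=4000, seed=0):
    """Monte Carlo estimate of <T2>/N."""
    rng = random.Random(seed)
    draws = (pairwise_coalescence_time(Ns, NUd, rng) for _ in range(n_samples))
    return sum(draws) / n_samples
```

For a nearly neutral population the estimate should be close to one, and it should drop well below one when deleterious mutations are common and strongly selected.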



Appendix B: Fitness variance under weak selection 

In the main text, we postulate an equivalence principle
between weakly selected populations with the same variance
in fitness, which we calculate here. When selection is
strong, $\sigma^2$ can easily be calculated from the
mutation-selection balance in Eq. (1) in the main text,
and we find that

$$ \sigma^2 = U_d s . \tag{B1} $$

However, we wish to apply these results precisely in the
region where the deterministic mutation-selection balance
becomes unreliable, so we would like to generalize this
calculation to include the full effects of drift as
$Ns \to 0$. A standard calculation shows that more
generally, we have

$$ \sigma^2 = U_d s \left[ 1 - \frac{R}{U_d} \right] , \tag{B2} $$

where $R$ is the substitution rate of deleterious
mutations (Etheridge et al. 2007; Good and Desai 2012;
Haigh 1978; Higgs and Woodcock 1995). This reduces to the
ordinary strong-selection result in Eq. (B1) when
$N s e^{-\lambda} \gg 1$, but we now focus on the regime
where $R$ is not necessarily small compared to $U_d$.
Fortunately, when $R$ is on the order of $U_d$, this
substitution rate is equivalent to the rate of Muller's
ratchet in the so-called "fast-ratchet" regime (Gessler
1995), where many of the complicated stochastic aspects of
the ratchet analyzed in Neher and Shraiman (2012) can be
neglected and results from traveling wave theory (Good et
al. 2012; Hallatschek 2011; Rouzine et al. 2008) can be
applied.

FIG. 6 The variance in fitness within the population as a
function of the selection strength, for NUd = 5 (blue),
NUd = 50 (green), and NUd = 500 (red). Symbols denote the
results of forward-time simulations with N = 10* averaged
over 300 independent runs, while the solid lines are the
predictions obtained from Eqs. (B2) and (B23). The dashed
lines give the asymptotic behavior for Ns → 0 and Ns → ∞.

We calculate the rate of deleterious substitutions using
the tunable constraint framework introduced in Hallatschek
(2011), which modifies the standard Wright-Fisher
stochastic dynamics in order to make $R$ easier to
calculate. These predictions for the substitution rate
have been shown to agree with ordinary Wright-Fisher
simulations in several regimes of positive selection (Good
et al. 2012; Hallatschek 2011), but they have yet to be
directly applied to the purifying selection regime studied
here.

We introduce two new quantities $f(x)$ and $w(x)$, which
respectively correspond to the population density and the
fixation probability of new mutants at relative fitness
$x$. In the tunable constraint framework, these are
related to the population size and substitution rate
through the system of equations

$$ sR \, \partial_x f(x) = x f(x) - f(x) w(x) + U_d \left[ f(x+s) - f(x) \right] , \tag{B3} $$

$$ -sR \, \partial_x w(x) = x w(x) - w(x)^2 + U_d \left[ w(x-s) - w(x) \right] , \tag{B4} $$

and the normalization conditions

$$ 1 = \int f(x) \, dx , \qquad \frac{v}{N} = \int f(x) w(x) \, dx , \tag{B5} $$

where $v$ is the variance in offspring number (equal to
unity in the Wright-Fisher model). This system of
equations can in principle be solved to obtain $R$ as a
function of $N$, $s$, and $U_d$, but in their current
form, these equations are difficult to solve (even
numerically) due to the delay terms $f(x+s)$ and $w(x-s)$
in the differential equations. Thus, we turn to an
approximate solution.

Since we are focused on a regime where selection is weak,
it seems reasonable to try a Taylor expansion of these
delay terms in powers of $s$:

$$ f(x+s) = f(x) + s \, \partial_x f(x) + \frac{s^2}{2} \partial_x^2 f(x) + \ldots \tag{B6} $$

In order to obtain a non-trivial solution for $w(x)$, we
must include at least the second order term in this
expansion, after which Eqs. (B3) and (B4) can be rewritten
in the form

$$ 0 = x f(x) - w(x) f(x) + \frac{U_d s^2}{2} \partial_x^2 f(x) + U_d s \left[ 1 - \frac{R}{U_d} \right] \partial_x f(x) , \tag{B7} $$

$$ 0 = x w(x) - w(x)^2 + \frac{U_d s^2}{2} \partial_x^2 w(x) - U_d s \left[ 1 - \frac{R}{U_d} \right] \partial_x w(x) . \tag{B8} $$

Note that this is essentially the same model analyzed in
Hallatschek (2011), with the parameters $D$ and $v$ of
that work corresponding to $U_d s^2 / 2$ and
$U_d s (1 - R/U_d)$ here. A similar analysis can therefore
be applied in the present case. We first rescale the
relative fitness $x$ by introducing the new coordinate

$$ \chi = \left( \frac{U_d s^2}{2} \right)^{-1/3} x . \tag{B9} $$

Since $w(x)$ and $f(x)$ currently have the units of
fitness and inverse fitness, respectively, we must rescale
these as well:

$$ f(\chi) = \left( \frac{U_d s^2}{2} \right)^{1/3} f(x) , \qquad w(\chi) = \left( \frac{U_d s^2}{2} \right)^{-1/3} w(x) . \tag{B10} $$

In terms of these rescaled variables, our system of
equations can be written in the compact form

$$ 0 = \partial_\chi^2 f + \alpha \, \partial_\chi f + \chi f - f w , \tag{B11} $$

$$ 0 = \partial_\chi^2 w - \alpha \, \partial_\chi w + \chi w - w^2 , \tag{B12} $$

$$ 1 = \int f(\chi) \, d\chi , \tag{B13} $$

$$ \frac{1}{\beta} = \int f(\chi) w(\chi) \, d\chi , \tag{B14} $$

where we have introduced the two parameters

$$ \alpha = (4\lambda)^{1/3} \left[ 1 - \frac{R}{U_d} \right] , \tag{B15} $$

$$ \beta = \left( \frac{\lambda}{2} \right)^{1/3} \frac{Ns}{v} . \tag{B16} $$

From inspection, we can immediately see that

$$ f(\chi) \propto e^{-\alpha \chi} w(\chi) $$

is a solution to Eq. (B11), which allows us to eliminate
$f$ entirely and yields the simplified system

$$ 0 = \partial_\chi^2 w - \alpha \, \partial_\chi w + \chi w - w^2 , \qquad \beta = \frac{\int_{-\infty}^{\infty} e^{-\alpha \chi} w(\chi) \, d\chi}{\int_{-\infty}^{\infty} e^{-\alpha \chi} w(\chi)^2 \, d\chi} , \tag{B17} $$

where $w(\chi)$ is subject to the boundary conditions
$w(\chi) \to 0$ as $\chi \to -\infty$ and
$w(\chi) \to \chi$ as $\chi \to \infty$. Thus, numerical
solution of the boundary value problem (for instance,
using Matlab's bvp4c function) and subsequent numerical
integration of the resulting $w$ allows us to calculate
$\beta$ as a function of $\alpha$,

$$ \beta = g(\alpha) , \tag{B18} $$

where $g$ is independent of any of the evolutionary
parameters. A subsequent inversion of this relation yields
an expression for $1 - R/U_d$ as a function of $Ns$ and
$\lambda$:

$$ 1 - \frac{R}{U_d} = \frac{1}{(4\lambda)^{1/3}} \, g^{-1}\!\left[ \left( \frac{\lambda}{2} \right)^{1/3} \frac{Ns}{v} \right] . \tag{B19} $$
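
The chain from Eq. (B19) back to the fitness variance in Eq. (B2) can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the Matlab implementation mentioned in the text: `g_inverse` is a hypothetical user-supplied function (for example, interpolated from a numerical solution of the boundary value problem), the rescaled parameters are taken to be α = (4λ)^{1/3}(1 − R/U_d) and β = (λ/2)^{1/3} Ns/v, and the result is capped at one because deleterious mutations cannot accumulate at a negative rate. The stand-in `g_inverse_small_alpha` assumes the linear small-α form g(α) ≈ α/2 and should not be trusted outside that limit.

```python
def fitness_variance(N, s, Ud, g_inverse, v=1.0):
    """Fitness variance sigma^2 = Ud*s*(1 - R/Ud), with 1 - R/Ud taken from Eq. (B19).

    g_inverse is a hypothetical callable returning alpha for a given beta.
    """
    lam = Ud / s
    beta = (lam / 2.0) ** (1.0 / 3.0) * N * s / v
    one_minus_R_over_Ud = g_inverse(beta) / (4.0 * lam) ** (1.0 / 3.0)
    # R cannot be negative, so 1 - R/Ud is capped at one.
    one_minus_R_over_Ud = min(1.0, one_minus_R_over_Ud)
    return Ud * s * one_minus_R_over_Ud


def g_inverse_small_alpha(beta):
    """Illustrative stand-in: inverse of the assumed small-alpha form g(alpha) = alpha / 2."""
    return 2.0 * beta
```

With this stand-in, a weakly selected population (Ns much less than one) recovers the neutral value N U_d s², while a strongly selected one saturates at the deterministic value U_d s.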



For general values of $\alpha$, this function $g(\alpha)$
must be calculated numerically using the approach outlined
above (an implementation in Matlab is available upon
request). However, we can obtain simple analytical
formulae in two limiting cases. The limit of large
$\alpha$ has been studied extensively in previous work
(Goyal et al. 2012; Hallatschek 2011; Tsimring et al.
1996), and we find that $\log g(\alpha) \sim \alpha^3$. In
the limit $\alpha \to 0$, the differential equation for
$w$ reduces to the parameter-free form

$$ 0 = \partial_\chi^2 w + \chi w - w^2 , \tag{B20} $$

whose solution can be reasonably well-approximated in this
limit by

$$ w(\chi) \approx \begin{cases} \chi & \text{if } \chi > 0 , \\ 0 & \text{else.} \end{cases} \tag{B21} $$

The normalization integrals in Eq. (B17) can therefore be
approximated by $\Gamma$-functions, and we find that
$g(\alpha) \approx \alpha / 2$. In terms of the underlying
evolutionary parameters, this implies that

$$ 1 - \frac{R}{U_d} \approx \frac{Ns}{v} \tag{B22} $$

as $Ns \to 0$, which agrees with an analogous calculation
using the neutral coalescent.

As a slight technical aside, we note that the left-hand
side of Eq. (B19) by definition cannot be greater than
one, since the deleterious mutations cannot accumulate at
a negative rate. Nevertheless, as $Ns$ increases the
right-hand side of Eq. (B19) eventually becomes larger
than one (particularly in the large $\alpha$ limit
discussed above). This likely indicates a breakdown of the
mutational diffusion approximation we assumed in Eq. (B6),
or possibly a breakdown in the applicability of the
tunable constraint framework in general. In order to
maintain sensible results, we therefore take

$$ 1 - \frac{R}{U_d} = \min\left\{ 1 , \; \frac{1}{(4\lambda)^{1/3}} \, g^{-1}\!\left[ \left( \frac{\lambda}{2} \right)^{1/3} \frac{Ns}{v} \right] \right\} . \tag{B23} $$

In Fig. 6, we compare these predictions with the results
of forward-time simulations for a representative sample
of parameters. The agreement is generally quite good
(certainly better than the naive asymptotics alone),
although there are some small systematic disagreements.
In particular, we tend to slightly overestimate the
variance in fitness at the point where it starts to
deviate from the deterministic asymptote
$\sigma^2 = U_d s$, and we tend to slightly underestimate
it during the transition to the neutral asymptote
$\sigma^2 = N U_d s^2$.



Appendix C: The Coarse-Grained Model

In the main text, we proposed a theory of weak purifying
selection, which was based on two underlying assumptions:

1. Weakly selected populations with the same $NU_{tot}$
and $N\sigma$ form an equivalence class in terms of the
patterns of diversity they contain.

2. Within this equivalence class, there exists a
population with parameters
$(N\tilde{s}, N\tilde{U}_d, N\tilde{U}_n)$ where the
strength of selection is strong enough that the
structured coalescent is valid.

This allows us to generate predictions for any weakly
selected population by identifying the corresponding
"coarse-grained" population and applying the structured
coalescent.



1. Finding the coarse-grained parameters



For a particular population with parameters $Ns$, $NU_d$,
and $NU_n$, we first identify the corresponding
equivalence class by calculating $N\sigma$ using the
formulae in Appendix B. Then, as described in the main
text, the corresponding coarse-grained population can be
found by minimizing $\langle T_2 \rangle / N$ as a
function of $Ns$, with $N\sigma$ held constant. This
coarse-grained population is by definition in the strong
selection regime, so $\langle T_2 \rangle / N$ can be
calculated analytically using the results in Walczak et
al. (2012), which state that

$$ \frac{\langle T_2 \rangle}{N} = \sum_{k_1=0}^{\infty} \sum_{k_2=k_1}^{\infty} \sum_{k_c=0}^{k_1} t(k_1, k_2, k_c) \, p(k_1, k_2, k_c) . \tag{C1} $$



The function $t(k_1, k_2, k_c)$ is the mean coalescent
time for two individuals in classes $k_1$ and $k_2$ that
coalesce in class $k_c$, and is given by



in the special case that $k_1 = k_2 = k_c$ and

j{-iy~^ fk. 



(C2) 
otherwise. The function $p(k_1, k_2, k_c)$ is the
probability of sampling two individuals from classes $k_1$
and $k_2$ that coalesce in class $k_c$, and is given by



p{ki,k2,kc) 



2fce 



h{ki,k2 
(C4)



where 



h{k,,k2) = { 



if fcl 

else. 



(C5) 



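In addition, when Eq. (C1) is valid, the fitness variance
is related to $Ns$ and $NU_d$ through the simple relation

$$ (N\sigma)^2 = (NU_d)(Ns) . \tag{C6} $$

Putting these two facts together, we see that the
coarse-grained $N\tilde{s}$ is given by the root of the
equation

$$ 0 = \frac{\partial \langle T_2 \rangle}{\partial (Ns)} - \frac{2\lambda}{Ns} \frac{\partial \langle T_2 \rangle}{\partial \lambda} , \tag{C7} $$

where we set $\lambda = (N\sigma / Ns)^2$ after taking the
derivative. Once we have determined $N\tilde{s}$, the
corresponding mutation rates $N\tilde{U}_d$ and
$N\tilde{U}_n$ are given by

$$ N\tilde{U}_d = \frac{(N\sigma)^2}{N\tilde{s}} , \qquad N\tilde{U}_n = NU_n + \left( NU_d - N\tilde{U}_d \right) . \tag{C8} $$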
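
As a brief check, the root condition in Eq. (C7) appears to follow from the chain rule. The sketch below assumes that $\langle T_2 \rangle$ is regarded as a function of $Ns$ and $\lambda$, and that $N\sigma$ is held fixed through Eq. (C6):

$$ \lambda = \frac{NU_d}{Ns} = \frac{(N\sigma)^2}{(Ns)^2} \quad \Longrightarrow \quad \left. \frac{d\lambda}{d(Ns)} \right|_{N\sigma} = -\frac{2\lambda}{Ns} , $$

$$ \left. \frac{d \langle T_2 \rangle}{d(Ns)} \right|_{N\sigma} = \frac{\partial \langle T_2 \rangle}{\partial (Ns)} - \frac{2\lambda}{Ns} \frac{\partial \langle T_2 \rangle}{\partial \lambda} . $$

Setting this total derivative to zero locates the minimum of $\langle T_2 \rangle / N$ along a line of constant $N\sigma$.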




FIG. 7 The mean pairwise coalescence time as a function
of the selection strength, for fixed Nσ ≈ 2 (blue), Nσ ≈ 7
(green), and Nσ ≈ 43 (red). Symbols denote the results of
forward-time simulations, while the solid lines show the
predictions from the original structured coalescent. The
dotted lines denote the corresponding points (Ns, NUd) on
the critical line, which are utilized by our coarse-grained
theory.



Thus, we have constructed an explicit mapping between the
underlying parameters $(Ns, NU_d, NU_n)$ and those of the
corresponding coarse-grained model
$(N\tilde{s}, N\tilde{U}_d, N\tilde{U}_n)$ where we can
apply the structured coalescent. A copy of our
implementation in Python is available upon request.
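
The last step of this mapping is simple enough to sketch directly. The snippet below is not the authors' Python implementation; it is a minimal illustration that assumes the coarse-grained selection strength (called `Ns_tilde` here, a hypothetical name) has already been found by minimizing the mean pairwise coalescence time, and then applies the relations for the coarse-grained mutation rates so that the total mutation rate is conserved.

```python
def coarse_grained_parameters(N_sigma, Ns_tilde, NUd, NUn):
    """Map the original rates onto coarse-grained ones at fixed N*sigma and NU_tot.

    N_sigma  : scaled fitness standard deviation of the original population
    Ns_tilde : coarse-grained selection strength (assumed to be known already)
    NUd, NUn : original deleterious and neutral mutation rates
    """
    # Fitness variance is preserved: (N sigma)^2 = NUd_tilde * Ns_tilde.
    NUd_tilde = N_sigma ** 2 / Ns_tilde
    # Deleterious mutations that are no longer selected are counted as neutral.
    NUn_tilde = NUn + (NUd - NUd_tilde)
    return Ns_tilde, NUd_tilde, NUn_tilde
```

For example, with N_sigma = 7, Ns_tilde = 5, NUd = 50 and NUn = 300, the coarse-grained deleterious rate is 49/5 and the remainder of the original deleterious rate is reassigned to the neutral class.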



2. Deviations from the coarse-grained predictions 

This procedure enables us to generate predictions for 
any set of parameters Ns, NUd, and NUn, but the va- 
lidity of these predictions depends on the validity of the 
underlying assumptions (1) and (2) stated above. Figure 
3 (main text) shows that these assumptions are approx- 
imately true over a broad parameter regime, but some 
small systematic deviations are observed. In order to 
examine these deviations in more detail, we focus on a 
narrower (yet still representative) set of parameters. 

In the same manner as Figure 4 in the main text, we
simulate three lines of constant Nσ and NUtot [calculated
from Eqs. (B2) and (B23)] for Nσ ≈ 2, Nσ ≈ 7, and
Nσ ≈ 43. Results for the mean pairwise coalescent time
are shown in Fig. 7. In all three cases, we observe a
characteristic "hockey-stick" shape, in which a region of
rapidly varying ⟨T2⟩/N sharply transitions to a region
of significantly reduced variation. This abrupt transition
coincides with the minimum of the structured coalescent
predictions for ⟨T2⟩/N, and hence with the boundary of
the strong selection regime.

If assumption (1) is exactly satisfied, the points to the
left of this boundary should all have the same value. We
see that this is true to a good degree of approximation,
in the sense that the remaining variation in these points
is much less than the variation between the lines with
different Nσ. Nevertheless, there remains a slight
downward trend along these lines of constant Nσ as
selection grows weaker, which corresponds to these points
being slightly "less neutral" than we would predict using
our equivalence class. In Fig. 8, we plot the mean
pairwise heterozygosity and the total number of
segregating sites for these lines as well. We observe a
similar "hockey stick" shape, although the deviation from
assumption (1) is slightly stronger than we observed for
⟨T2⟩/N.

FIG. 8 The mean pairwise heterozygosity (top) and the total
number of segregating sites for a sample of n = 100
individuals (bottom) as a function of selection strength
for the lines of constant Nσ shown in Fig. 7, with
NUtot = 350. Symbols denote the results of forward-time
simulations, while the solid lines show the predictions
from the effective population size assumption, fitted to
the simulated values of ⟨T2⟩ in Fig. 7. The dotted lines
denote the corresponding points (Ns, NUd) on the critical
line, which are utilized by our coarse-grained theory.

Our coarse-grained theory also depends on the validity
of assumption (2), which requires that the structured
coalescent predictions along the critical line should
match forward-time simulations without any further
modifications. Again, while Figs. 4 and 5 in the main text
show that this is generally true, we do observe some
systematic deviations, especially for points where both Ns
and NUd are small. We can examine this regime more closely
in Fig. 9, which plots the pairwise coalescent time as a
function of the selection strength for constant NUd = 1.

We observe that at these low values of λ = NUd/Ns, the
structured coalescent overestimates the characteristic
minimum value of ⟨T2⟩, although it gets the location more
or less correct. Typically, these low values of λ are
associated with extremely strong selection pressures, so
that the classic background selection approximation is
valid. However, we see that near this minimum (which
represents the maximum deviation from neutrality for this
level of mutation) neither background selection nor the
structured coalescent gives the correct result. Our
coarse-grained theory can therefore do no better.

FIG. 9 The mean pairwise coalescent time (top) and the total
tree length for a sample of n = 100 individuals (bottom) as
a function of selection strength, for fixed NUd = 1. Symbols
denote the results of forward-time simulations, and the
black dashed lines give approximate 95% confidence
intervals. The solid blue line shows the predictions from
the original structured coalescent, while the dashed blue
line is the corresponding background selection
approximation.

Further study of this low NUd and low Ns region may 
shed light on the interactions between stochastic fluctua- 
tions and the structured coalescent framework, and could 
offer insight on how to incorporate first order corrections 
for these effects. However, these deviations are generally 
small, and for such low values of Ns and NUd the pop- 
ulation is nearly neutral anyway. In addition, small dis- 
crepancies in this relatively narrow region of parameter 
space are unlikely to matter much for practical purposes, 
since the patterns of diversity observed in actual popu- 
lations are likely to be dominated by larger deviations 
from neutrality attained at larger values of Ns and NUd.

FIG. 10 The distribution of neutral pairwise heterozygosity
for a population with Ns = 5 (left) and Ns = 0.25 (right),
with NUd = NUn = 50. Black lines denote the results of
forward-time simulations and the red lines show the
structured coalescent predictions for our coarse-grained
theory. For comparison, the blue lines show the predictions
from the standard effective population size picture, with
Ne fitted from the mean of the forward-time distribution.

FIG. 11 The relative variance in pairwise coalescent time as
a function of Nσ for the same populations as Figure 3 in
the main text. Again, symbols denote the results of
forward-time simulations (black dotted lines show
approximate 95% confidence intervals), while the solid red
line gives the structured coalescent predictions of our
coarse-grained theory. For comparison, the predictions from
the standard effective population size picture are shown as
a solid blue line.



3. Comparison with the effective population size picture 

The diversity statistics in Fig. 8 allow us to make a
more detailed comparison with the effective population
size picture that is typically used to interpret such
data. In this case, Ne can be measured exactly by fitting
to the corresponding ⟨T2⟩ values in Fig. 7, and the
predictions for π and Sn (shown as solid lines in Fig. 8)
follow from assuming that the individual sites otherwise
evolve independently at this reduced effective population
size. We see that while this assumption leads to excellent
agreement for π (as observed previously by Kaiser and
Charlesworth 2008), it drastically underestimates Sn,
which depends more sensitively on the genealogical
distortions caused by purifying selection.

We can find even larger discrepancies by focusing on the
distributions of certain statistics, particularly those
related to the neutral diversity, which take on an
extremely simple form in the effective population size
picture. For example, under this assumption the
distribution of neutral pairwise heterozygosity is
predicted to follow a geometric distribution with mean
2NeUn. The most likely value is πn = 0, and the
probability of larger values decreases monotonically with
increasing πn. In Fig. 10, we plot the distribution of
neutral heterozygosity for two populations in the weak
selection regime. We see that this effective population
size assumption fails to capture the qualitative features
of the distribution, despite being fitted to the correct
mean value. This effect is exaggerated closer to the
critical line, where the distribution develops a strong
peak at a nonzero value of πn resulting from a
corresponding peak in the distribution of T2.

We can quantify this peaked nature of the distribution
over a broader range of parameters by looking at the
variance in the pairwise coalescent time. Under the
effective population size assumption, T2 follows an
exponential distribution with mean Ne. This implies that
the ratio of the variance to the squared mean is given by

$$ \frac{\mathrm{Var}(T_2)}{\langle T_2 \rangle^2} = 1 , \tag{C9} $$

independent of Ne or any of the other parameters. In
Fig. 11, we measure this statistic for each of the
populations in Fig. 4 (main text) and construct an
analogous collapse plot. Again, our equivalence principle
is highly accurate, and our coarse-grained predictions
from the structured coalescent quantitatively describe
these effects of linked selection that are not even
qualitatively captured by the effective population size
picture.
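
The two predictions of the effective population size picture used in this comparison can be reproduced with a short sketch. The function names below are hypothetical and the snippet is only an illustration: it draws T2 from an exponential distribution (in units of N generations, so the mean is Ne/N), estimates the relative variance, and evaluates the geometric distribution of neutral pairwise differences with mean θ = 2NeUn.

```python
import random


def sample_T2_effective(Ne_over_N, n_samples, seed=0):
    """Draw pairwise coalescence times under a fixed effective population size."""
    rng = random.Random(seed)
    return [rng.expovariate(1.0 / Ne_over_N) for _ in range(n_samples)]


def relative_variance(samples):
    """Sample estimate of Var(T2) / <T2>^2."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / (n - 1)
    return var / mean ** 2


def geometric_pmf(j, theta):
    """P(pi_n = j) when T2 is exponential and neutral mutations are Poisson.

    theta = 2 * Ne * Un is the mean number of pairwise differences.
    """
    return (1.0 / (1.0 + theta)) * (theta / (1.0 + theta)) ** j
```

Under these assumptions the relative variance stays close to one for any choice of Ne, and the most likely number of pairwise differences is always zero, which is the behaviour that the simulated distributions fail to show.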



Appendix D: Forward-time Simulations 

We validate several of our key approximations in the main
text by comparing our theoretical predictions with the
results of forward-time, discrete-generation simulations
similar to the standard Wright-Fisher model (Ewens 2004).
These simulations begin with a clonal population of $N$
individuals, and in each subsequent generation the
population undergoes a selection step followed by a
mutation step. In the selection step, each lineage (i.e.
unique genotype) is assigned a new size from a Poisson
distribution with mean

$$ \lambda_i = C (1 + X_i) \, n_i , \tag{D1} $$

where $n_i$ is the current size of the lineage, $X_i$ is
its fitness relative to the population average, and
$C = N / \sum_i n_i$ is a normalization constant chosen to
ensure that the total population size remains close to
$N$. In the mutation step, each individual mutates with
probability $U_d + U_n$, and if it does, the new mutation
is deleterious with probability $U_d / (U_d + U_n)$. This
process is continued for a sufficiently long period of
time that the population reaches the steady-state
mutation-selection balance introduced in the main text,
and several population-wide coalescence events have
occurred. A copy of our implementation in
C is available upon request.
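
The selection and mutation steps described above can be sketched as follows. This is a simplified Python illustration, not the C implementation referred to in the text. It makes several assumptions that are not in the original: individuals are tracked one by one rather than grouped into lineages (a Poisson lineage size with mean C(1 + X_i)n_i is equivalent to each of the n_i members leaving a Poisson number of offspring with mean C(1 + X_i)), fitness is taken to be additive so that X = -s(k - mean k), and all function names are hypothetical.

```python
import math
import random


def sample_poisson(mean, rng):
    """Knuth's method; adequate here because per-individual means are near one."""
    threshold = math.exp(-mean)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def simulate(N, s, Ud, Un, generations, seed=0):
    """Return the final population as a list of (deleterious, neutral) mutation counts."""
    rng = random.Random(seed)
    population = [(0, 0)] * N  # clonal starting population
    for _generation in range(generations):
        size = len(population)
        if size == 0:
            raise RuntimeError("population went extinct")
        mean_k = sum(k for k, _ in population) / size
        C = N / size  # keeps the total size close to N
        offspring = []
        for k, m in population:
            x = -s * (k - mean_k)  # fitness relative to the population mean
            mean_offspring = max(0.0, C * (1.0 + x))
            for _child in range(sample_poisson(mean_offspring, rng)):
                child_k, child_m = k, m
                if rng.random() < Ud + Un:             # mutation step
                    if rng.random() < Ud / (Ud + Un):  # deleterious
                        child_k += 1
                    else:                              # neutral
                        child_m += 1
                offspring.append((child_k, child_m))
        population = offspring
    return population
```

With strong selection, for instance N = 1000, s = 0.1 and Ud = 0.05, the mean number of deleterious mutations per individual should settle near Ud/s after a few hundred generations, while the population size fluctuates around N.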



